We analyze the thermodynamic properties of a simplified model for folded RNA
molecules recently studied by G. Vernizzi, H. Orland, A. Zee (in Phys. Rev.
Lett. 94 (2005) 168103). The model consists of a chain of one-flavor base
molecules with a flexible backbone and all possible pairing interactions
equally allowed. The spatial pseudoknot structure of the model can be
efficiently studied by introducing an N x N hermitian random matrix model at
each chain site, and associating Feynman diagrams of these models to spatial
configurations of the molecules. We obtain an exact expression for the
topological expansion of the partition function of the system. We calculate exact
and asymptotic expressions for the free energy, specific heat, entanglement
and chemical potential and study their behavior as a function of temperature.
Our results are consistent with the interpretation of 1/N as being a measure
of the concentration of Mg++ in solution.

The application of mathematical and statistical mechanics techniques to
suitable biological problems has been a successful area of recent research
[1-3]. In particular, the study of the spatial and topological (pseudoknot) structure
of DNA and RNA molecules is a successful example of the above [4-11]. An RNA
molecule is a heteropolymer strand made up of four types of nucleotides: uracil
(U), adenine (A), guanine (G), and cytosine (C). The sequence of these bases from
the 5' to the 3' end defines the primary structure of the molecule. In solution, at 
room temperature, different bases can pair with each other by means of saturating 
hydrogen bonds to give the molecule a stable shape in three dimensions, with U 
bonding to A, C to G, and wobble pair G to U, all with different interaction energies. 
This last interaction (non Watson-Crick base pair), together with triplets, quartets,
etc., has an important role in the folding of the RNA molecule [12-15]. The effect of
stacking interactions also contributes to the stability of the molecule, making sets of
adjacent bonds twist into the familiar Watson-Crick helices. Among all possible structures
that arise from interaction between the bases, one defines the secondary structures
of an RNA molecule as all structures which are represented by planar arc diagrams,
that is, with no crossing of arcs in a representation resembling a Feynman diagram. When
the diagrams are non-planar, one says that an RNA molecule contains one or more
pseudoknots (see [16, 17] for a general definition and [21] for a discussion on the 
planarity of the diagrams). Finally, one defines the tertiary structure of RNA as the 
actual spatial three-dimensional arrangement of the base sequence. 

Several methods have been successfully used to study the folding dynamics of
RNA molecules in various conditions. Some of these are based on statistical
mechanics models, which usually avoid the complexities related to the dynamical
evolution of real-world molecules but allow for a simple, kinematical treatment
of the proposed models. Therefore, the study of these models can shed light on the
intricate molecular dynamics and is our main motivation for considering them. In
this paper, we study a simplified model of an RNA-like molecule considered in [5] in
which the geometric degrees of freedom of the system, such as the stiffness and the
steric constraints of the chain, are not taken into account. In addition, the authors
of [5] consider that all pairs of bases interact with a common pairing strength (the
assumption of neglecting disorder along the sequence is actually a classic
approximation [22]). Moreover, the model keeps the fundamental property of saturation
of the interactions, that is, a given base in the chain can interact (following the
rules mentioned above) with only one other base at a time. The study of this model is
interesting in itself and has motivated some interesting work in the literature,
including the case of the planar diagram limit (no pseudoknots) [18-20], and the study
of the tertiary structure of the RNA molecule (see, for instance, [5-9]). Moreover, the
model allows for some exact calculations including the partition function, the specific
heat, and some other thermodynamical and physical quantities. Therefore, the study of
the physical properties of this model could be considered as a first approximation or a
limiting case for more realistic RNA models. A natural extension of the model in [5]
towards more realistic ones (for example, including different interactions between
the bases) could be developed by simple modifications of a matrix potential.

The authors of [5] consider a system of L molecules (nucleotide bases) forming a
linear macromolecule with the shape of a chain. They do not describe the formation
of the backbone, but only the interaction between links of the chain that produces
the folding of the RNA macromolecule (see Fig. 1). Each base can interact through
an attractive force with any other base of the chain. However, once a given molecule
has paired with another, it will not interact again with any other. In this case, it
is said that the interaction between these bases saturates.

Figure 1: Arc diagram representation of the interacting pairs (1,3) and (2,6) and
its respective folded diagram.

Although the bases that form the RNA molecule interact with different pairwise
energies, a first simple approximation is to consider that this energy is the same for
all bonds, and that any pairing of bases is assumed to be feasible. This amounts
to considering just one type of base and no further selection rules. Note that if all
energies are equal, then the Boltzmann factors v = e^{-βε} (where ε is the pairing
energy, β = 1/kT, T is the absolute temperature and k is the Boltzmann constant, which
we set equal to one) are equal as well. The configurational partition function of a
molecule of size L in the model in [5] can be written as

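$$ Z(L,N,T) = \frac{\int d\varphi \; e^{-\frac{N}{2}\mathrm{tr}\,\varphi^{2}} \; \frac{1}{N}\,\mathrm{tr}\,(1+v^{1/2}\varphi)^{L}}{\int d\varphi \; e^{-\frac{N}{2}\mathrm{tr}\,\varphi^{2}}} , \tag{1} $$
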
where φ is a random N x N hermitian matrix and Z depends on T through v. Note
that the simple form of (1) is only a consequence of the symmetry of the matrix
potential that reduces the original integration over L matrices to one integration
over φ [5]. From the theory of random matrices (see for example [26], pp. 140-2)
it follows that

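$$ \frac{1}{N}\langle \mathrm{tr}\,\varphi^{2k} \rangle = \frac{(2k)!}{2^{k}\,k!}\,\frac{1}{N^{k+1}} \sum_{j=0}^{k} \binom{k}{j}\binom{N}{j+1}\,2^{j} , \tag{2} $$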

where all averages are performed with respect to the gaussian measure dφ e^{-(N/2) tr φ^2}.
Replacing this into (1) and taking into account that ⟨tr φ^k⟩ = 0 for k odd, we arrive at

$$ Z(L,N,T) = \sum_{k=0}^{[L/2]} \binom{L}{2k}\,\frac{(2k)!}{2^{k}\,k!}\,\frac{v^{k}}{N^{k+1}} \sum_{j=0}^{k} \binom{k}{j}\binom{N}{j+1}\,2^{j} , \tag{3} $$

where the symbol [L/2] means the integer part of L/2. From (3) we may compute
Z exactly (for each L), as a function of N and T. The large-N asymptotic expansion
of Z has a well-known topological meaning [31]: the power of v is the number of arcs
in the diagram, and the power of 1/N^2 is the genus g of the diagram. It is therefore
convenient to write (3) in the following form:
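
$$ Z(L,N,T) = \sum_{k=0}^{[L/2]} d_k(L,N)\, e^{-\beta \epsilon_k} , \tag{4} $$

where ε_k = kε and

$$ d_k(L,N) = \binom{L}{2k}\,\frac{(2k)!}{2^{k}\,k!}\,\frac{1}{N^{k+1}} \sum_{j=0}^{k} \binom{k}{j}\binom{N}{j+1}\,2^{j} . \tag{5} $$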



From (4) we see that the spectrum of the system has [L/2] + 1 possible energies
0, ε, 2ε, ..., [L/2]ε and the degeneracy of the k-th level is d_k(L, 1). For example,
for L = 7 the maximum energy of a configuration is 3ε and there are d_3(7, 1) = 105
different configurations with that energy. Moreover, considering (5) as a function of
N yields its topological information, e.g., for L = 7, d_3(7, N) = 35 + 70/N^2, which
means that out of the total 105 configurations with energy 3ε, 35 have genus 0 and 70
have genus 1.
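
The counting above can be checked with a short Python sketch using exact rational arithmetic. It assumes the closed form d_k(L,N) = C(L,2k) (2k-1)!! N^{-(k+1)} Σ_j C(k,j) C(N,j+1) 2^j, which is the standard moment formula for Gaussian hermitian matrices and reproduces the numbers quoted in the text; the function names are illustrative and are not taken from [5].

```python
from fractions import Fraction
from math import comb, factorial


def degeneracy(L, k, N):
    """Weighted number d_k(L, N) of configurations with k pairings (energy k*eps).

    Assumed closed form (Gaussian hermitian matrix moments):
    d_k = C(L, 2k) * (2k-1)!! * N^-(k+1) * sum_j C(k, j) * C(N, j+1) * 2^j.
    """
    double_factorial = factorial(2 * k) // (2**k * factorial(k))  # (2k-1)!!
    n_sum = sum(comb(k, j) * comb(N, j + 1) * 2**j for j in range(k + 1))
    return comb(L, 2 * k) * double_factorial * Fraction(n_sum, N ** (k + 1))


def partition_function(L, N, v):
    """Z(L, N, T) as a sum over levels, with Boltzmann factor v per pairing."""
    return sum(degeneracy(L, k, N) * Fraction(v) ** k for k in range(L // 2 + 1))


if __name__ == "__main__":
    print(degeneracy(7, 3, 1))                       # 105: all three-arc diagrams
    print(degeneracy(7, 3, 10) - Fraction(70, 100))  # 35: planar part at N = 10
    print(partition_function(5, 1, 1))               # 26 = 21 planar + 5 torus
```

Evaluating `degeneracy(7, 3, N)` for several values of N and subtracting 70/N^2 always leaves 35, which is the genus-0 count.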

Next, we calculate the partition function in the large N limit, which is the planar 
limit, using the results of [24]. For completeness, we quote here the results relevant 
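for ours. We define the resolvent W(p):

$$ W(p) = \sum_{n=0}^{\infty} \frac{W_n}{p^{\,n+1}} , \tag{6} $$

where p is a complex variable, and

$$ W_n = \frac{1}{N}\langle \mathrm{tr}\,\varphi^{n} \rangle . \tag{7} $$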

In the large N limit, the resolvent is given by the solution to the following equation,
called Pastur's equation [25] (in the limit g → 0 and c → 0, in [24]):

$$ W^{2}(p) - p\,W(p) + 1 = 0 , \tag{8} $$

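which is:

$$ W(p) = \frac{p - \sqrt{p^{2} - 4}}{2} = \sum_{k=0}^{\infty} \frac{C_k}{p^{\,2k+1}} , \tag{9} $$

where C_k are the Catalan numbers [27]. From (9) we obtain

$$ W_{2k} = C_k = \frac{(2k)!}{k!\,(k+1)!} , \qquad W_{2k+1} = 0 . \tag{10} $$

With (10) we write down the partition function in the large N limit. We consider
the case N = 1 for comparison purposes as well:

$$ Z(L,N\to\infty,T) = \sum_{k=0}^{[L/2]} \frac{L!}{(L-2k)!\,k!\,(k+1)!}\, v^{k} , \tag{11} $$

$$ Z(L,N=1,T) = \sum_{k=0}^{[L/2]} \frac{L!}{(L-2k)!\,k!\,2^{k}}\, v^{k} . \tag{12} $$
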
Note that both expressions for Z are very similar, except for the factors (k + 1)!
and 2^k in the denominators of the expansion coefficients. Noting that the first
factor is never smaller than the second, we conclude that Z(N → ∞) ≤ Z(N = 1). The
interpretation of this result is clear if we recall that, for v = 1, Z(N → ∞) counts
the planar diagrams only, whereas Z(N = 1) counts both the planar and non-planar
diagrams [5-8]. Moreover, we verify that both partition functions coincide for
L ≤ 3, as they should given that all diagrams are planar in
these cases. Furthermore, in these two limiting cases, the partition function can be
written in terms of hypergeometric functions:

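$$ Z(L,N=1,T) = {}_2F_0\!\left(-\tfrac{L}{2},\, -\tfrac{L}{2}+\tfrac{1}{2};\; ;\, 2v\right) , \tag{13} $$

$$ Z(L,N\to\infty,T) = {}_2F_1\!\left(-\tfrac{L}{2},\, -\tfrac{L}{2}+\tfrac{1}{2};\, 2;\, 4v\right) , \tag{14} $$

where

$$ {}_pF_q(a; b; z) = \sum_{k=0}^{\infty} \frac{\prod_{i=1}^{p} (a_i)_k}{\prod_{j=1}^{q} (b_j)_k}\, \frac{z^{k}}{k!} $$

and (a)_k = a(a+1)...(a+k-1) are the k-order Pochhammer symbols. We remark here that
the results (13) and (14) are exact.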
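
The hypergeometric representations can be verified term by term. The following Python sketch assumes the two finite sums Σ_k L!/((L-2k)! k! 2^k) v^k for N = 1 and Σ_k L!/((L-2k)! k! (k+1)!) v^k for N → ∞, and compares them with truncated 2F0 and 2F1 series in exact arithmetic. All function names are illustrative. The truncation is exact because one of the upper parameters is a non-positive integer, so the series terminates at k = [L/2].

```python
from fractions import Fraction
from math import factorial


def z_all_diagrams(L, v):
    """Z(L, N=1, T): sum over k of L! / ((L-2k)! k! 2^k) * v^k."""
    v = Fraction(v)
    return sum(
        Fraction(factorial(L), factorial(L - 2 * k) * factorial(k) * 2**k) * v**k
        for k in range(L // 2 + 1)
    )


def z_planar(L, v):
    """Z(L, N -> infinity, T): sum over k of L! / ((L-2k)! k! (k+1)!) * v^k."""
    v = Fraction(v)
    return sum(
        Fraction(factorial(L), factorial(L - 2 * k) * factorial(k) * factorial(k + 1))
        * v**k
        for k in range(L // 2 + 1)
    )


def pochhammer(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1)."""
    result = Fraction(1)
    for i in range(k):
        result *= a + i
    return result


def hypergeometric(upper, lower, z, terms):
    """First `terms` terms of the pFq series, evaluated exactly."""
    total = Fraction(0)
    for k in range(terms):
        numerator = Fraction(1)
        for a in upper:
            numerator *= pochhammer(a, k)
        denominator = Fraction(factorial(k))
        for b in lower:
            denominator *= pochhammer(b, k)
        total += numerator / denominator * Fraction(z) ** k
    return total


if __name__ == "__main__":
    v = Fraction(1, 3)
    for L in range(8):
        a1, a2 = Fraction(-L, 2), Fraction(1 - L, 2)
        terms = L // 2 + 1
        assert z_all_diagrams(L, v) == hypergeometric([a1, a2], [], 2 * v, terms)
        assert z_planar(L, v) == hypergeometric([a1, a2], [Fraction(2)], 4 * v, terms)
    # Chain sizes below 8 at which the two partition functions coincide at v = 1.
    print([L for L in range(8) if z_all_diagrams(L, 1) == z_planar(L, 1)])
```

The final line lists the sizes for which no non-planar diagram exists; the first crossing appears at L = 4, where two arcs can intersect.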

As we mentioned before, the power of 1/N^2 yields the genus g of the diagram,
that is, the minimum number of handles of the surface on which the diagram can
be drawn without crossings. From the table of values of Z for the smallest values of L
in [5] we notice, for instance, that for L = 5 the number of planar diagrams on the
sphere is 21 and the number of non-planar diagrams that can be drawn on a torus
without crossings is 5. Next, we would like to write Z(L,N,T) in the form of a
topological expansion [5,6,31], that is, as a power series in the variable 1/N^2:

$$ Z(L,N,T) = \sum_{g=0}^{\infty} z_g(L,T)\,\frac{1}{N^{2g}} , \tag{15} $$

where z_g(L, ∞) is, for a molecule of size L, the number of diagrams that can be drawn
without crossings on a topological surface of genus g. For the example of the previous
paragraph, we have z_0(5, ∞) = 21 and z_1(5, ∞) = 5. Note that z_g(L,T), as a function
of T, is the partition function of the system living on the topological surface of genus g. In
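order to bring the partition function to the form (15), we first define the auxiliary
function:

$$ G_k(N) = \frac{1}{N^{k+1}} \sum_{j=0}^{k} \binom{k}{j}\binom{N}{j+1}\,2^{j} . \tag{16} $$

This function contains all the N dependence of (3). Below, we write the binomial
coefficient as

$$ \binom{N}{j+1} = \frac{1}{(j+1)!} \sum_{m=0}^{j+1} S_{j+1}^{(m)}\, N^{m} , \tag{17} $$

where S_j^{(m)} is the Stirling number of the first kind [27,28] with parameters m, j (in
turn, we define S_j^{(m)} = 0 if m > j or if j < 0). Replacing (17) in (16), we obtain

$$ G_k(N) = \sum_{j=0}^{k} \binom{k}{j}\,\frac{2^{j}}{(j+1)!} \sum_{m=0}^{j+1} S_{j+1}^{(m)}\, N^{m-k-1} . \tag{18} $$

Now, if we want to obtain the O(1/N^{2g}) term of G_k(N) (we indicate this by G_k^{(g)}),
we must require that k + 1 - m = 2g, then j = k - 2g, k - 2g + 1, ..., k. To obtain
all orders of N we must add all possible values of g:

$$ G_k(N) = \sum_{g=0}^{\infty} \frac{G_k^{(g)}}{N^{2g}} = \sum_{g=0}^{\infty} \frac{1}{N^{2g}} \sum_{j=k-2g}^{k} \binom{k}{j}\,\frac{2^{j}}{(j+1)!}\, S_{j+1}^{(k+1-2g)} , \tag{19} $$

replacing in (3) we obtain (15) with z_g(L,T) given by:

$$ z_g(L,T) = \sum_{k=0}^{[L/2]} \sum_{j=k-2g}^{k} \binom{L}{2k}\,\frac{(2k)!}{k!}\,\binom{k}{j}\,\frac{2^{\,j-k}}{(j+1)!}\, S_{j+1}^{(k+1-2g)}\, v^{k} . \tag{20} $$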

In the limit T → ∞, z_g(L,T) coincides with a_{L,g} of [5]. Using the above mentioned
property of the Stirling numbers, we see from (20) that the maximum genus of a
diagram for a given L is [L/4], therefore g ≤ [L/4]. An analysis of the T-dependent
phase transition from the topological expansion of the partition function will be
given in a future publication [32].
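
A minimal Python sketch of the genus expansion follows. It assumes that the N dependence of each level enters through the binomial C(N, j+1), expands that binomial in powers of N with signed Stirling numbers of the first kind, and collects the coefficient of 1/N^{2g}. The helper names `stirling1` and `z_genus` are hypothetical. The sketch reproduces z_0 = 21 and z_1 = 5 for L = 5 at v = 1 and the bound g ≤ [L/4].

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling1(n, m):
    """Signed Stirling number of the first kind s(n, m).

    Defined by x (x-1) ... (x-n+1) = sum_m s(n, m) x^m.
    """
    if n == 0 and m == 0:
        return 1
    if n <= 0 or m <= 0 or m > n:
        return 0
    return stirling1(n - 1, m - 1) - (n - 1) * stirling1(n - 1, m)


def z_genus(L, g, v=1):
    """Coefficient of 1/N^(2g) in Z(L, N, T), with Boltzmann factor v per arc."""
    v = Fraction(v)
    total = Fraction(0)
    for k in range(L // 2 + 1):
        m = k + 1 - 2 * g  # power of N selected from the expanded binomial
        if m < 1:
            continue  # k arcs cannot reach genus g
        inner = sum(
            Fraction(comb(k, j) * 2**j * stirling1(j + 1, m), factorial(j + 1))
            for j in range(max(k - 2 * g, 0), k + 1)
        )
        pairings = factorial(2 * k) // (2**k * factorial(k))  # (2k-1)!!
        total += comb(L, 2 * k) * pairings * inner * v**k
    return total


if __name__ == "__main__":
    print([z_genus(5, g) for g in range(3)])  # sphere, torus, genus 2 for L = 5
    print([z_genus(8, g) for g in range(4)])  # genus 3 is out of reach for L = 8
```

Summing `z_genus(L, g)` over g at v = 1 gives the total number of arc diagrams, which is a convenient consistency check on the expansion.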

In the rest of this letter, we analyze the thermodynamic properties implied by
the partition function (3) by studying some interesting observables. We start by
considering the 'entanglement' (non-local two-point correlation function) between
two bases of the chain in our model. For that, we use the following definition
for the correlation between the molecules i and j (i < j) of the chain of size L

$$ \langle i,j \rangle = \frac{Z(j-i,N,T)}{Z(L,N,T)} , \tag{21} $$



where Z(j - i,N,T) is the partition function for the molecule including bases i
up to j. For the case of periodic boundary conditions we have Z(j - i,N,T) =
Z(j - i + 1,N,T). In the low temperature limit the partition function becomes
independent of N (in this limit only configurations with at most two interacting
bases are possible, that is, planar configurations). Therefore, (21) yields the same
result for both the N = 1 and N → ∞ cases:

with L ≫ 1, x = j - i ≫ 1 (provided βε ≫ ln(L^2/12) + ln(2 + 1/N^2)). The
physical behavior of this observable can be obtained by considering the situation
where x is large, yielding ⟨i,j⟩ ∼ x^2, which can be interpreted as a signal of
confinement (long-range order with critical exponent -2). This behavior is coherent with
the interpretation of the model as describing a folded RNA molecule. Note that the
exponent (-2) for the long range order coincides with the value found in [20,23].

Going ahead with our study of observables of the model, we now calculate the
normalized free energy f, or free energy per molecule, in the limit of low temperature:

$$ f(L,N,T) = \frac{F(L,N,T)}{L} = -\frac{T}{L}\,\ln Z(L,N,T) , \tag{23} $$

where F is the free energy of the system. For T ≪ 1, both the N = 1 and N → ∞
cases yield:

$$ f(\beta \gg 1) = -\frac{1}{2}\,(L-1)\,\frac{e^{-\beta\epsilon}}{\beta} . \tag{24} $$
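
For reference, the low-temperature limit can be recovered in two lines. The sketch below assumes that the Boltzmann factor per pairing is v = e^{-βε} and that only the ground state and the L(L-1)/2 single-pair configurations contribute, which requires v L^2 ≪ 1:

```latex
\begin{align*}
Z(L,N,T) &\simeq 1 + \binom{L}{2}\, v = 1 + \frac{L(L-1)}{2}\, e^{-\beta\epsilon}, \\
f &= -\frac{1}{\beta L}\,\ln Z
   \simeq -\frac{1}{\beta L}\,\frac{L(L-1)}{2}\, e^{-\beta\epsilon}
   = -\frac{1}{2}\,(L-1)\,\frac{e^{-\beta\epsilon}}{\beta}.
\end{align*}
```

Single-pair diagrams are planar and carry no power of 1/N, which is why the result is the same for N = 1 and N → ∞.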

Furthermore, for the special case of v = 1/4 (for which β = ln 4/ε) we obtain, in the
limit of large L,

$$ f(N\to\infty,\, v=1/4) = -\frac{1}{\beta}\,\ln 2 . \tag{25} $$

Next, we define the chemical potential of the model as:

$$ \mu(L,N,T) = \frac{\partial F(L,N,T)}{\partial N} = -\frac{T}{Z(L,N,T)}\,\frac{\partial Z(L,N,T)}{\partial N} . \tag{26} $$

The interpretation of μ is the following: we consider that there is a gas of N
'particles' in the internal space of the random matrices at each site of the chain of size
L. The chemical potential measures the response of the system to a change in the
size N of the matrix φ. On the other hand, the concentrations of secondary and
tertiary structures can be separated experimentally by varying the concentration of
Mg++ ions in solution [6,9,33]; in the original model [5], one can assign this role
of regulation to N, as is mentioned in [9], and it can be seen from (15) that this
dependency is of the form 1/N ~ [Mg++]. Therefore, the chemical potential μ can be
considered as a measure of the influence of the concentration of Mg++ ions in solution
on the system. From (15) we see that, in the large N limit, Z is O(1) and ∂Z/∂N
is O(1/N^3), therefore μ is O(1/N^3) in the form:

$$ \mu(L,N,T) = \frac{2T}{N^{3}}\,\frac{z_1(L,T)}{z_0(L,T)} + O(1/N^{5}) . \tag{27} $$

In the large N limit, we obtain the partition functions on the sphere and on the
torus, z_0 and z_1 respectively, and write down the chemical potential (see [28] for
explicit expressions of S_n^{(m)}):

$$ \mu(L,N,T) \simeq \frac{T}{6N^{3}}\,\langle k^{3} - k \rangle_0 , \tag{28} $$

where the averages are defined as:

$$ \langle f(k) \rangle_0 = \frac{1}{z_0(L,T)} \sum_{k=0}^{[L/2]} f(k)\,\frac{L!}{(L-2k)!\,k!\,(k+1)!}\, v^{k} , \tag{29} $$

and are labelled by the subindex 0 in order to distinguish them from the previously
defined ones. Using numerical calculations, it can be seen that for T ≫ 1, ⟨k^3⟩_0 ≫
⟨k⟩_0 and ⟨k^3⟩_0 is independent of T, then:

$$ \mu(L,N,T\gg 1) \simeq \frac{\langle k^{3} \rangle_0}{6N^{3}}\, T , \tag{30} $$

whereas for T ≪ 1, we see that the dependence of μ on T is given by:

$$ \mu(L,N,T\ll 1) \simeq \frac{2}{N^{3}}\binom{L}{4}\, T\, e^{-2\epsilon/T} . \tag{31} $$

The limits we have just discussed are summarized graphically in Fig. 2. For any value
of the temperature, the system will tend to configurations with large N, because
this minimizes the chemical potential. In this regime of N, the concentration of
positive ions in solution is small, and the configurations of the molecules will tend
to be planar.
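
The two temperature regimes can be explored numerically with the following Python sketch. It assumes the leading large-N form μ ≈ (T / 6N^3) ⟨k^3 - k⟩_0, with the planar average taken with weights C(L,2k) C_k v^k (C_k the Catalan numbers) and v = exp(-ε/T) with k_B = 1. The low-temperature expression 2 C(L,4) T exp(-2ε/T) / N^3 is the k = 2 term of that average, derived here rather than quoted. Function names and the default `eps = 1.0` are illustrative choices.

```python
import math
from math import comb


def catalan(k):
    """Catalan number C_k, the number of planar pairings of 2k points."""
    return comb(2 * k, k) // (k + 1)


def planar_average(L, T, f, eps=1.0):
    """Average of f(k) with the assumed planar weights C(L, 2k) * C_k * v^k."""
    v = math.exp(-eps / T)
    weights = [comb(L, 2 * k) * catalan(k) * v**k for k in range(L // 2 + 1)]
    return sum(w * f(k) for k, w in enumerate(weights)) / sum(weights)


def mu_large_n(L, N, T, eps=1.0):
    """Leading large-N chemical potential, T / (6 N^3) * <k^3 - k>_0."""
    return T / (6 * N**3) * planar_average(L, T, lambda k: k**3 - k, eps)


def mu_low_t(L, N, T, eps=1.0):
    """Low-temperature limit: only the k = 2 term of <k^3 - k>_0 survives."""
    return 2 * comb(L, 4) * T * math.exp(-2 * eps / T) / N**3


if __name__ == "__main__":
    L, N = 10, 100
    for T in (0.05, 0.2, 1.0, 10.0, 1e4):
        print(T, mu_large_n(L, N, T), mu_low_t(L, N, T))
```

At small T the two columns agree, while at large T the first column grows linearly in T. Doubling N lowers μ by a factor of eight at fixed T, which illustrates why large N is favored.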

Figure 2: Chemical potential as a function of the temperature for N = 100.

The previous expressions for μ lead us to a consistent physical interpretation of
the parameter N. We recall that [30]:

$$ \frac{\partial S}{\partial N} = -\frac{\partial \mu}{\partial T} , \tag{32} $$

where S is the entropy. From equations (30) and (31), one can see that S ∝ 1/N^2 in
both the T → 0 and T → ∞ limits. Therefore, the entropy vanishes for N → ∞, which
is also the limit in which the topology of the molecule is spherical (by the topology
of a molecule, we mean the topology of the Feynman diagrams associated with the
configuration of the molecule). This suggests that S could be considered as an indicator
of the spatial topological configurations of the molecule. One can argue that the
genus of the molecule is largely determined by the conditions of the surrounding
medium, such as the concentration of Mg++ ions. The competition between the
interaction of a given base molecule with other molecules in the chain and with the
ions of the medium regulates the folding of the chain and, therefore, its genus. One
may assume that the concentration of ions in the medium should be a monotonic
function of 1/N. Therefore, we arrive at the conclusion that the 'internal' parameter
N, introduced by hand as a convenient variable for organizing the topological
configurations, could be given the physical interpretation of representing the inverse
of the ion concentration of the medium [6, 9, 21].

Next, we consider the specific heat at constant volume (in this case the volume
is the size of the chain, V = L):

$$ C_V(T) = -T \left( \frac{\partial^{2} F}{\partial T^{2}} \right)_{L,N} . \tag{33} $$



The graph of C_V against the temperature (Fig. 3 (a)) has the particular shape
characteristic of a system with a finite number of energy levels (see the comment after
equation (5)). The characteristic temperature T_ch corresponds to the position of the
peak in C_V shown in the graphs. For temperatures above and below T_ch, the specific heat
decreases rapidly. This well-known behavior of the specific heat with temperature
is known as the Schottky anomaly [29, 30] and it is a general property of systems
with discrete, degenerate energy levels (see d_k above). In the low temperature
region, we have

$$ C_V(T \ll 1) = k \left( \frac{\epsilon}{2kT} \right)^{2} \mathrm{sech}^{2}\!\left[ \frac{\epsilon}{2kT} - \frac{1}{2}\ln\frac{L(L-1)}{2} \right] . \tag{34} $$



In this limit, the specific heat coincides with that for a two-level system, since for
low enough temperatures, the system will only be able to access the ground state
and the first excited state. In Fig. 4, we plot the exact specific heat from (33)
and the low-temperature approximation from (34). Furthermore, we may define the
topological specific heat as:

$$ C_g(T) = -T \left( \frac{\partial^{2} F_g}{\partial T^{2}} \right)_{L,g} , \tag{35} $$



where F_g(L,T) = -kT ln(z_g(L,T)). Note that C_g can be identified with the specific
heat restricted to the diagrams on the topological surface of genus g. In Fig. 3 (b)
we show that the higher peaks correspond to the lower genera. This agrees with the
intuitive argument that considers a molecule with higher genus as strongly folded
and, therefore, with a reduced number of degrees of freedom. We can carry out the
same analysis for C_V, given that the addition of a new base molecule increases the
number of degrees of freedom of the system. In this sense, we can consider the
relation L ~ 1/g.
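
The Schottky peak and its two-level limit can be reproduced with a short Python sketch. It assumes energy levels E_k = kε with the degeneracies d_k(L,N) of the Gaussian matrix moments, uses the fluctuation formula C_V = β^2 (⟨E^2⟩ - ⟨E⟩^2) with k_B = 1, and compares it with a two-level system made of the ground state and the L(L-1)/2 single-pair states. Function names and the default `eps = 1.0` are illustrative.

```python
import math
from math import comb, factorial


def degeneracies(L, N):
    """Assumed level degeneracies d_k(L, N) for k = 0, ..., [L/2]."""
    levels = []
    for k in range(L // 2 + 1):
        pairings = factorial(2 * k) // (2**k * factorial(k))  # (2k-1)!!
        n_sum = sum(comb(k, j) * comb(N, j + 1) * 2**j for j in range(k + 1))
        levels.append(comb(L, 2 * k) * pairings * n_sum / N ** (k + 1))
    return levels


def specific_heat(L, N, T, eps=1.0):
    """Exact C_V from energy fluctuations, C_V = beta^2 * Var(E), with k_B = 1."""
    beta = 1.0 / T
    weights = [d * math.exp(-beta * eps * k) for k, d in enumerate(degeneracies(L, N))]
    z = sum(weights)
    mean_e = sum(w * k * eps for k, w in enumerate(weights)) / z
    mean_e2 = sum(w * (k * eps) ** 2 for k, w in enumerate(weights)) / z
    return beta**2 * (mean_e2 - mean_e**2)


def specific_heat_two_level(L, T, eps=1.0):
    """Two-level approximation: ground state plus C(L, 2) single-pair states."""
    x = eps / (2.0 * T)
    return x**2 / math.cosh(x - 0.5 * math.log(comb(L, 2))) ** 2


if __name__ == "__main__":
    for T in (0.05, 0.08, 0.1, 0.2, 0.5, 1.0, 5.0):
        print(T, specific_heat(20, 1, T), specific_heat_two_level(20, T))
```

For L = 20 and N = 1 the two columns agree closely at the lowest temperatures and separate once configurations with two or more pairs become accessible.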




Figure 3: a) C_V(T) for N = 10 and L = 100, 200, 300. b) C_g(T) for L = 100 and
g = 0, 10, 20.

Figure 4: Exact and low-temperature approximation for the specific heat with L =
20 and N = 1.

In conclusion, we have studied several thermodynamical and topological aspects
of the simplified model of RNA of [5]. We have presented an exact expression for the
partition function of the system, and given an interpretation of the degeneracy of each
energy level as a function of N. Furthermore, we have calculated the topological
expansion of the partition function of the model, in which the coefficients of the
expansion can be interpreted as the reduced partition functions for systems restricted
to topological surfaces of genus g. We showed that the maximum genus of the
configurations is [L/4], for a molecule of size L. Moreover, we have calculated
asymptotic expressions for some thermodynamical observables, as a function of the
temperature. Analyzing the expressions for the chemical potential and entropy,
within our data, we find a consistent interpretation relating the variable 1/N (arising
from the matrix model) and the concentration of Mg++, as has been suggested in
[6,9].

MdE thanks Matias Reynoso for helpful discussions.